ABSTRACT
and that will provide local and abundant information just as the
information functions do in comparison with the test reliability
coefficient in classical mental test theory. In so doing, validity
indices for different purposes of testing and those that are tailored
for a specific population of examinees are considered. The resulting
indices should not be incidental as those in classical mental test
theory are; they are truly attributes of the item and the test. Six
figures illustrate the discussion. (Author/SLD)
I Introduction

In classical mental test theory, the reliability and the validity coefficients of a test are considered to
be two essential topics. In modern mental test theory, or in latent trait models, however, this is not the
case. In particular, test validity is one concept that has been neglected in the context of latent trait
models.

Several types of validity have been identified and discussed in classical mental test theory,
including content validity, construct validity, and criterion-oriented validity. Perhaps we can say that, in
modern mental test theory, both content validity and construct validity are well accommodated, although
they are not explicitly stated. If each item is based upon cognitive processes that are directly related
to the ability to be measured, then the content of the operationally defined latent variable behind
the examinees' performances will be validated. Construct validity can also be identified, with all the
mathematically sophisticated structures and functions which characterize latent trait models and which
classical mental test theory does not provide.

With respect to criterion-oriented validity, however, latent trait models have so far not offered
as much as they have for test reliability and the standard error of measurement (cf. Samejima,
1977, 1990). From the scientific point of view, however, we need to confirm whether the test indeed measures
what it is supposed to measure, even if we have chosen our items carefully enough in regard to their
contents, and even if we are equipped with highly sophisticated mathematics.

In classical mental test theory, the validity coefficient is a single number, i.e., the product-moment
correlation coefficient between the test score and the criterion variable. Researchers tend to put too
much faith in the validity coefficient, or in the reliability coefficient, however. The correlation coefficient
is largely affected by the heterogeneity of the group of examinees, i.e., for a fixed test the coefficient
tends to be higher when individual differences among the examinees in the group are greater, and vice
versa (cf. Samejima, 1977). Thus we must keep in mind that so-called test validity represents the degree
of heterogeneity in ability among the examinees tested, as well as the quality of the test itself.
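The dependence of the validity coefficient on group heterogeneity can be illustrated with a small calculation. The sketch below assumes a deliberately simple linear model that is not part of the original text: test score X = theta + e1 and criterion Y = theta + e2, with mutually independent errors. The function name and the numerical values are hypothetical.

```python
import math

def validity_coefficient(sd_ability, sd_error_test, sd_error_criterion):
    """Correlation between X = theta + e1 and Y = theta + e2 (independent errors).

    Assumed illustrative model: the covariance of X and Y is var(theta),
    and each variance is var(theta) plus the respective error variance.
    """
    var_theta = sd_ability ** 2
    var_x = var_theta + sd_error_test ** 2
    var_y = var_theta + sd_error_criterion ** 2
    return var_theta / math.sqrt(var_x * var_y)

# Same test and same criterion (identical error SDs), administered to two
# hypothetical groups that differ only in the spread of ability.
homogeneous = validity_coefficient(1.0, 1.0, 1.0)
heterogeneous = validity_coefficient(2.0, 1.0, 1.0)
print(homogeneous, heterogeneous)
```

Under this assumed model the coefficient rises with the spread of ability in the group even though nothing about the test itself has changed, which is the point made in the paragraph above.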

By virtue of the population-free nature of latent trait theory, we should be able to find some indices
of item validity, and of test validity, which are not affected by the group of examinees. The resulting
indices should not be incidental as those in classical mental test theory are, but truly be attributes of
the item and the test themselves.

In the present research an attempt has been made to obtain such population-free measures of item
validity and of test validity, which are basically locally defined.

II Performance Function: Regression of the External Criterion
Variable on the Latent Variable

Let $\theta$ be ability, or latent trait, which assumes any real number. We assume that there is a set
of n test items measuring $\theta$ whose characteristics are known. Let g denote such an item, $k_g$ be
a discrete item response to item g, and $P_{k_g}(\theta)$ denote the operating characteristic of $k_g$, or the
conditional probability assigned to $k_g$, given $\theta$, i.e.,

(2.1)   $P_{k_g}(\theta) = \mathrm{Prob.}[k_g \mid \theta]$ .

We assume that $P_{k_g}(\theta)$ is three-times differentiable with respect to $\theta$. We have for the item response
information function

(2.2)   $I_{k_g}(\theta) = -\frac{\partial^2}{\partial\theta^2} \log P_{k_g}(\theta)$ ,

and the item information function is defined as the conditional expectation of $I_{k_g}(\theta)$, given $\theta$, so
that we can write

(2.3)   $I_g(\theta) = E[I_{k_g}(\theta) \mid \theta] = \sum_{k_g} I_{k_g}(\theta) P_{k_g}(\theta)$ .

In the special case where the item g is scored dichotomously, this item information function is simplified
to become

(2.4)   $I_g(\theta) = \left[\frac{\partial}{\partial\theta} P_g(\theta)\right]^2 \left[\{P_g(\theta)\}\{1 - P_g(\theta)\}\right]^{-1}$ ,

where $P_g(\theta)$ is the operating characteristic of the correct answer to item g.
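As a minimal numerical sketch of (2.4), the following evaluates the item information of a dichotomous item. The two-parameter logistic form of $P_g(\theta)$ and the parameter names a and b are assumptions made for illustration; the text does not commit to any particular operating characteristic.

```python
import math

def p_logistic(theta, a, b):
    """Assumed operating characteristic of the correct answer (2PL logistic)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def item_information(theta, a, b, h=1e-5):
    """Item information per (2.4): [dP/dtheta]^2 / (P * (1 - P)).

    The derivative is taken by a central difference so that the function
    mirrors (2.4) directly and does not rely on a logistic-specific shortcut.
    """
    p = p_logistic(theta, a, b)
    dp = (p_logistic(theta + h, a, b) - p_logistic(theta - h, a, b)) / (2.0 * h)
    return dp * dp / (p * (1.0 - p))

# Hypothetical item with a = 1.5, b = 0.0, evaluated at theta = b.
print(item_information(0.0, 1.5, 0.0))
```

For the assumed logistic form the result at theta = b should agree with a squared divided by 4, which gives a quick sanity check on the numerical derivative.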
Let V be a response pattern such that

(2.5)   $V = \{k_g\}'$ ,   $g = 1, 2, \ldots, n$ .

The operating characteristic, $P_V(\theta)$, of the response pattern V is defined as the conditional probability
of V, given $\theta$, and by virtue of local independence we can write

(2.6)   $P_V(\theta) = \prod_{k_g \in V} P_{k_g}(\theta)$ .

The response pattern information function, $I_V(\theta)$, is given by

(2.7)   $I_V(\theta) = -\frac{\partial^2}{\partial\theta^2} \log P_V(\theta) = \sum_{k_g \in V} I_{k_g}(\theta)$ ,

and the test information function, $I(\theta)$, is defined as the conditional expectation of $I_V(\theta)$, given $\theta$,
and we obtain from (2.2), (2.3), (2.5), (2.6) and (2.7)

(2.8)   $I(\theta) = E[I_V(\theta) \mid \theta] = \sum_V I_V(\theta) P_V(\theta) = \sum_{g=1}^{n} I_g(\theta)$ .


A big advantage of modern mental test theory is that the standard error of estimation can be defined
locally by using $[I(\theta)]^{-1/2}$. Unlike in classical mental test theory, this function does not depend
upon the population of examinees, but is solely a property of the test itself, as should be the case if
we call it the standard error, or the reliability, of a test. It is well known that this function provides
us with the asymptotic standard deviation of the conditional distribution of the maximum likelihood
estimate of $\theta$, given its true value.
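The additivity in (2.8) and the local standard error $[I(\theta)]^{-1/2}$ can be sketched as follows. The logistic item form, the closed-form information a^2 P (1 - P) that goes with it, and the item parameters are all assumptions for illustration only.

```python
import math

def info_2pl(theta, a, b):
    """Item information of an assumed 2PL logistic item: a^2 * P * (1 - P)."""
    p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return a * a * p * (1.0 - p)

def total_information(theta, items):
    """Test information per (2.8): the sum of the item information functions."""
    return sum(info_2pl(theta, a, b) for a, b in items)

def standard_error(theta, items):
    """Local standard error of estimation, [I(theta)]^(-1/2)."""
    return total_information(theta, items) ** -0.5

# Hypothetical three-item test given as (a, b) pairs.
items = [(1.0, -1.0), (1.5, 0.0), (2.0, 1.0)]
for theta in (-2.0, 0.0, 2.0):
    print(theta, total_information(theta, items), standard_error(theta, items))
```

The printed values depend only on the items and on theta, not on any distribution of examinees, which is the population-free property stressed in the paragraph above.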

It is assumed that there exists an external criterion variable, which can be measured directly or
indirectly. This is the situation which is also assumed when we deal with criterion-oriented validity or
predictive validity in classical mental test theory.
FIGURE 2-1
Relationships among $\theta$, $\gamma$, $p_a$, and $\zeta(\theta)$.

FIGURE 2-2
Two Hypothetical Performance Functions, One of Which Is Not Likely to Be
the Case (Solid Line), and the Other Has a Derivative Equal to Zero at One Point
of $\theta$ (Dashed Line).
Let $\gamma$ denote the criterion variable, representing the performance in a specific job, etc. We shall
consider the conditional density of the criterion performance, given ability, and denote it by $\xi(\gamma \mid \theta)$.
The performance function, $\zeta(\theta)$, can be defined as the regression of $\gamma$ on $\theta$, or by taking, say, the
75, 90 or 95 percentile point of each conditional distribution of $\gamma$, given $\theta$. Let $p_a$ denote the
probability which is large enough to satisfy us as a confidence level. Thus we can write

where $\bar{\gamma}$ denotes the least upper bound of the criterion variable $\gamma$.

Figure 2-1 illustrates the relationships among $\theta$, $\gamma$, $p_a$, $\xi(\gamma \mid \theta)$ and $\zeta(\theta)$. It may be reasonable
to assume that the functional relationship between $\theta$ and $\zeta(\theta)$ is relatively simple, not as is illustrated
by the solid line in Figure 2-2, i.e., we do not expect $\zeta(\theta)$ to go up and down frequently within a
relatively short range of $\theta$. We shall assume that $\zeta(\theta)$ is twice differentiable with respect to $\theta$.

In dealing with an additional dimension or dimensions, i.e., the criterion variable or variables, in
latent space, one of the most difficult things is to keep the population-free nature which is characteristic
of latent trait models and which is, among others, the main feature that distinguishes the theory from
classical mental test theory. If we consider the projection of the operating characteristic of a discrete item
response on the criterion dimension, for example, then the resulting operating characteristic as a function
of $\gamma$ has to be incidental, for it has to be affected by the population distribution of $\theta$.

We need to start, therefore, from the conditional distribution of $\gamma$, given $\theta$, which can be conceived
of as being intrinsic in the relationship between the two variables, and independent of the population
distribution of $\theta$.

We assume that $\zeta(\theta)$ takes on the same value only at a finite or an enumerable number of points
of $\theta$. Let $P^*_{k_g}(\zeta)$ be the conditional probability assigned to the discrete response $k_g$, given $\zeta$. We
can write

(2.10)   $P^*_{k_g}(\zeta) = \sum_{\zeta(\theta) = \zeta} P_{k_g}(\theta)$ .

III When $\zeta(\theta)$ Is Strictly Increasing in $\theta$ : Simplest Case

[III.1] Amounts of Item and Test Information for a Fixed Value of $\zeta$

The simplest case is that $\zeta(\theta)$ is strictly increasing in $\theta$. In this case, $\zeta(\theta)$ has a one-to-one
correspondence with $\theta$, and (2.10) becomes simplified into the form

If, in addition, $\partial\theta/\partial\zeta$ is finite throughout the entire range of $\theta$, then we obtain

Let $I^*_{k_g}(\zeta)$ be the item response information function defined as a function of $\zeta$. We can write
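Let $I^*_g(\zeta)$ and $I^*(\zeta)$ be the amounts of information given by a single item g and by the total
test, respectively, for a fixed value of $\zeta$. Then we have from (2.3), (2.8) and (3.3)

(3.4)   $I^*_g(\zeta) = E[I^*_{k_g}(\zeta) \mid \zeta] = \sum_{k_g} I^*_{k_g}(\zeta) P^*_{k_g}(\zeta) = I_g(\theta) \left(\frac{\partial\theta}{\partial\zeta}\right)^2$

and

(3.5)   $I^*(\zeta) = I(\theta) \left(\frac{\partial\theta}{\partial\zeta}\right)^2$ .

If we take the square roots of these two information functions defined for $\zeta$, then we obtain

(3.6)   $[I^*_g(\zeta)]^{1/2} = [I_g(\theta)]^{1/2} \frac{\partial\theta}{\partial\zeta}$

and

(3.7)   $[I^*(\zeta)]^{1/2} = [I(\theta)]^{1/2} \frac{\partial\theta}{\partial\zeta}$ .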

Since a certain constant nature exists for the square root of the item information function while the
same is not true of the original item information function (cf. Samejima, 1979, 1982), $[I^*_g(\zeta)]^{1/2}$
given by (3.6), instead of the original function given by (3.4), may be more useful on some occasions.
This will be discussed later in this section, when the validity in selection plus classification is discussed.



[III.2] Validity in Selection

Suppose that we have a critical value, $\gamma_0$, of the criterion variable, which is needed for succeeding
in a specified job, and that we try to accept applicants whose values of the criterion variable are $\gamma_0$ or
greater. If our primary purpose of testing is to make an accurate selection of applicants, then (3.6) and
(3.7) for $\zeta = \gamma_0$, or their squared values shown by (3.4) and (3.5), indicate item and test validities,
respectively. In other words, if for some item formula (3.6) or (3.4) assumes high values at $\zeta = \gamma_0$,
then the standard error of estimation of $\zeta$ around $\zeta = \gamma_0$ becomes small, and chances are slim that we
misclassify the applicants by accepting unqualified persons or rejecting qualified ones.
The same logic applies to the total test by using formula (3.7) or (3.5) instead of (3.6)
or (3.4).

It should be noted in (3.6) or in (3.7) that $[I^*_g(\zeta)]^{1/2}$ or $[I^*(\zeta)]^{1/2}$ consists of two factors, i.e., 1)
the square root of the item information function $I_g(\theta)$ or that of the test information function $I(\theta)$,
and 2) the partial derivative of ability $\theta$ with respect to $\zeta$ at $\zeta = \gamma_0$. These two factors in each
formula are independent of each other, i.e., one belongs to the item or to the test and the other to the
statistical relationship between $\theta$ and $\gamma$. We also notice that these two factors are in a supplementary
relationship, i.e., even if one assumes a small value the other can supplement it in order to make the
resulting product large. Thus while it is important to have a large amount of item information, or of
test information, it is even more so to have large values of the derivative, $\partial\theta/\partial\zeta$, in the vicinity of
$\zeta = \gamma_0$, for this will increase the amount of item information defined with respect to $\zeta$ uniformly
in that vicinity, and also that of test information, as is obvious from the right hand sides of (3.6) and
(3.7). In other words, it is desirable for the purpose of selection for $\zeta$ to increase slowly in $\theta$ in the
vicinity of $\zeta = \gamma_0$.
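The two-factor structure described above can be made concrete with a small sketch. It assumes a hypothetical linear performance function zeta(theta) = c + d * theta with d > 0 and a 2PL logistic item; neither assumption comes from the text, and the function and parameter names are illustrative only.

```python
import math

def info_2pl(theta, a, b):
    """Item information of an assumed 2PL logistic item: a^2 * P * (1 - P)."""
    p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return a * a * p * (1.0 - p)

def selection_validity(gamma0, a, b, c, d):
    """Square-root item information with respect to zeta at zeta = gamma0.

    Factor 1: [I_g(theta)]^(1/2) evaluated at theta0 = theta(gamma0).
    Factor 2: d(theta)/d(zeta), which is 1 / d for zeta(theta) = c + d * theta.
    """
    theta0 = (gamma0 - c) / d   # inverse of the assumed performance function
    dtheta_dzeta = 1.0 / d
    return math.sqrt(info_2pl(theta0, a, b)) * dtheta_dzeta

# Same item and same critical ability theta0 = 0, two hypothetical criteria:
slow = selection_validity(gamma0=0.0, a=1.5, b=0.0, c=0.0, d=0.5)  # zeta rises slowly
fast = selection_validity(gamma0=0.0, a=1.5, b=0.0, c=0.0, d=2.0)  # zeta rises quickly
print(slow, fast)
```

With the item held fixed, the criterion whose performance function rises more slowly in theta yields the larger value, in line with the closing remark of the paragraph above.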

Since, in general, the same ability $\theta$ has predictabilities for more than one kind of job performance,
or of potential of achievement, the performance function varies for different criterion variables. Note
that neither $[I_g(\theta)]^{1/2}$ nor $[I(\theta)]^{1/2}$ is changed even when the criterion variable is switched. Thus,
for a fixed item or test whose amount of information is reasonably large around $\zeta = \gamma_0$, the derivative
$\partial\theta/\partial\zeta$ in the vicinity of $\zeta = \gamma_0$ determines the appropriateness of the use of the item or of the test for
the purpose of selection with respect to a specific job, etc. If this derivative assumes a high value, then
an item or a test which provides us with a medium amount of information may be acceptable for our
purpose of selection, while we will need an item or a test whose amount of information is substantially
larger if the derivative is low. Also, for the same criterion variable $\gamma$ the derivative $\partial\theta/\partial\zeta$ varies for
different values of $\gamma_0$, so the appropriateness of an item or of a test depends upon our choice of $\gamma_0$,
too.

The above logic also applies for the formulae (3.4) and (3.5), i.e., for the case in which we choose
the information functions, instead of their square roots, changing $\partial\theta/\partial\zeta$ to its squared value.

It is obvious from (3.4) and (3.5) that we can choose either $I_g(\theta(\gamma_0))$ or $[I_g(\theta(\gamma_0))]^{1/2}$ for use in
item selection, for their rank orders across different items are identical, and they equal the rank orders
of $I^*_g(\gamma_0)$ as well as those of $[I^*_g(\gamma_0)]^{1/2}$.

[III.3] Validity in Selection Plus Classification

If we take another standpoint that our purpose of testing is not only to make a right selection of
applicants but also to predict the degree of success in the job for each selected individual, then we will
need to integrate $[I^*_g(\zeta)]^{1/2}$ and $[I^*(\zeta)]^{1/2}$, respectively, since we must estimate $\zeta$ accurately not
only around $\zeta = \gamma_0$ but also for $\zeta > \gamma_0$. If we choose $[I^*_g(\zeta)]^{1/2}$ and $[I^*(\zeta)]^{1/2}$ in preference to
their squared values, we will obtain from (3.6) and (3.7)

(3.8)   $\int_{\Omega_\zeta} [I^*_g(\zeta)]^{1/2} \, d\zeta = \int_{\Omega_\theta} [I_g(\theta)]^{1/2} \, d\theta$

and

(3.9)   $\int_{\Omega_\zeta} [I^*(\zeta)]^{1/2} \, d\zeta = \int_{\Omega_\theta} [I(\theta)]^{1/2} \, d\theta$ ,

where $\Omega_\zeta$ and $\Omega_\theta$ indicate the domains of $\zeta$ and $\theta$ for which $\zeta(\theta) > \gamma_0$, respectively. In other
words, when our purpose of testing is not only to make an accurate selection among the applicants but
also to discriminate their ability accurately for future purposes among those who were accepted with
respect to the criterion variable $\gamma$, we need to select items which assume high values of (3.8) instead
of (3.6), or a test which provides us with a high value of (3.9) in place of (3.7).

Note that formulae (3.8) and (3.9) imply that we can obtain these two validity measures directly
from the original item and test information functions, respectively, i.e., without actually transforming
$\theta$ to $\zeta$, as long as we can identify the domain $\Omega_\theta$. This is true for any criterion variable $\gamma$.
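Because the measure in (3.8) can be evaluated on the ability scale, a plain numerical integration over theta is enough once the lower end of the ability domain is known. The sketch below assumes a 2PL logistic item and a domain of the form (theta0, infinity); the midpoint rule, the truncation point, and all names are illustrative choices, not part of the original method.

```python
import math

def sqrt_info_2pl(theta, a, b):
    """[I_g(theta)]^(1/2) for an assumed 2PL logistic item: a * sqrt(P * (1 - P))."""
    p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return a * math.sqrt(p * (1.0 - p))

def item_validity_38(theta0, a, b, n=20000):
    """Midpoint-rule approximation of the integral of [I_g(theta)]^(1/2)
    over (theta0, infinity), i.e. the right hand side of (3.8).

    The upper limit is truncated where the integrand is negligible.
    """
    upper = max(theta0, b) + 40.0 / a
    h = (upper - theta0) / n
    return h * sum(sqrt_info_2pl(theta0 + (i + 0.5) * h, a, b) for i in range(n))

# Hypothetical item (a = 1, b = 0) and three critical ability levels.
for theta0 in (-2.0, 0.0, 2.0):
    print(theta0, item_validity_38(theta0, 1.0, 0.0))
```

No transformation from theta to zeta is carried out anywhere in the computation; the performance function enters only through the choice of theta0.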

Some examples illustrating the values of (3.8) are given in Figure 3-1 for hypothetical items. In the
simplest case observed in this section and illustrated in Figures 2-1 and 3-1, these two domains, $\Omega_\theta$
and $\Omega_\zeta$, are provided by the two intervals, $(\theta_0, \infty)$ and $(\gamma_0, \bar{\gamma})$, where

(3.10)   $\theta_0 = \theta(\gamma_0)$
FIGURE 3-1
Some Examples of the Relationship between $\gamma_0$ and the Item Validity Measure
Given by (3.8).

FIGURE 3-2
Relationship between $\gamma_0$ and the Validity Indicated by (3.8) for Three Hypothetical
Dichotomous Items Whose Operating Characteristics for the Correct Answer Are
Strictly Increasing with Zero and Unity as Their Asymptotes.
and $\bar{\gamma}$ denotes the least upper bound of $\gamma$.

It should be noted that this pair of validity measures depends upon our choice of the critical value
$\gamma_0$. If this value is low, i.e., a specified job does not require high levels of competence with respect to
the criterion variable $\gamma$, then these validity indices assume high values, and vice versa. It has been
pointed out (Samejima, 1979, 1982) that there is a certain constancy in the amount of information
provided by a single test item. To give some examples, if an item is dichotomously scored and has a
strictly increasing operating characteristic for success with zero and unity as its two asymptotes, then
the area under the curve for $[I_g(\theta)]^{1/2}$ equals $\pi$, regardless of the mathematical form of the operating
characteristic and its parameter values; if it follows a three-parameter model with the lower asymptote,
$c_g$ (> 0), then this area is less than $\pi$ and strictly decreasing in, and solely dependent upon, $c_g$. We
can see, therefore, that if our items belong to the first type then the functional relationship between
$\gamma_0$ and the item validity measure given by (3.8) will be monotone decreasing, with $\pi$ and zero as its
two asymptotes, for each and every item. Figure 3-2 illustrates this relationship for three hypothetical
items of this type. As we can see in this figure, the appropriateness of the items changes with $\gamma_0$ in an
absolute sense, and also relatively to other items, and the rank orders of desirability among
the items depend upon our choice of $\gamma_0$.
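The constancy referred to above can be checked with a short worked derivation. It is a sketch under the conditions stated in the text (dichotomous scoring, strictly increasing $P_g(\theta)$); the substitution $u = P_g(\theta)$ and the closed forms in the last two lines are added here for illustration and are not quoted from the original.

```latex
% Area under [I_g(\theta)]^{1/2}, using (2.4) and the substitution u = P_g(\theta):
\int_{-\infty}^{\infty} [I_g(\theta)]^{1/2}\, d\theta
  = \int_{-\infty}^{\infty}
      \frac{\partial P_g(\theta)/\partial\theta}{\sqrt{P_g(\theta)\{1 - P_g(\theta)\}}}\, d\theta
  = \int_{0}^{1} \frac{du}{\sqrt{u(1-u)}}
  = \pi .

% With a lower asymptote c_g > 0 the lower limit of u becomes c_g:
\int_{c_g}^{1} \frac{du}{\sqrt{u(1-u)}}
  = \pi - 2 \arcsin \sqrt{c_g} \; < \; \pi .

% Measure (3.8) for an item of the first type, with \theta_0 = \theta(\gamma_0):
\int_{\theta_0}^{\infty} [I_g(\theta)]^{1/2}\, d\theta
  = \pi - 2 \arcsin \sqrt{P_g(\theta_0)} .
```

Since $P_g(\theta_0)$ runs from zero to unity as $\gamma_0$ increases, the last expression decreases monotonically from $\pi$ to zero, which agrees with the behavior described for Figure 3-2.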

We can see from (3.8) that this validity measure necessarily assumes a high value if an item is
difficult, and the same applies to (3.9) for the total test. This implies that these validity measures alone
cannot indicate the desirability of an item and of a test precisely for a specific population of examinees.
In selecting items or a test, therefore, it is desirable to take the ability distribution of the examinees
into account, if the information concerning the ability distribution of a target population is more or less
available. In so doing we shall be able to avoid choosing items which are too difficult for the target
population of examinees.

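Let $f(\theta)$ denote the density function of the ability distribution for a specific population of examinees,
and $f^*(\zeta)$ be that of $\zeta$ for the same population. Then we can write

(3.11)   $f^*(\zeta) = f(\theta) \frac{\partial\theta}{\partial\zeta}$ .

Adopting this as the weight function, from (3.6) and (3.7) we obtain as the validity indices tailored for
a specific population of examinees

(3.12)   $\int_{\Omega_\zeta} [I^*_g(\zeta)]^{1/2} f^*(\zeta) \, d\zeta = \int_{\Omega_\theta} [I_g(\theta)]^{1/2} f(\theta) \frac{\partial\theta}{\partial\zeta} \, d\theta$

and

(3.13)   $\int_{\Omega_\zeta} [I^*(\zeta)]^{1/2} f^*(\zeta) \, d\zeta = \int_{\Omega_\theta} [I(\theta)]^{1/2} f(\theta) \frac{\partial\theta}{\partial\zeta} \, d\theta$ .

Thus by using (3.12) and (3.13) instead of (3.8) and (3.9) we shall be able to make appropriate item
selection and test selection for a target population or sample, provided that the information concerning
its ability distribution is more or less available. Note that, unlike (3.8) and (3.9), we need $\partial\theta/\partial\zeta$
in evaluating these measures given by (3.12) and (3.13). Thus not only are these validity measures
specific for the ability distribution of a target population, but also they are heavily dependent upon the
functional formula of $\zeta(\theta)$.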
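A population-tailored index of this kind can be sketched numerically. The code assumes that, after the change of variable from zeta to theta, the integrand is $[I_g(\theta)]^{1/2} f(\theta) \, \partial\theta/\partial\zeta$; it further assumes a 2PL logistic item, a normal ability density, and a user-supplied derivative function. All of these are illustrative assumptions, and the names are hypothetical.

```python
import math

def sqrt_info_2pl(theta, a, b):
    """[I_g(theta)]^(1/2) for an assumed 2PL logistic item."""
    p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return a * math.sqrt(p * (1.0 - p))

def normal_density(theta, mean=0.0, sd=1.0):
    """Assumed ability density f(theta) of the target population."""
    z = (theta - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def tailored_item_validity(theta0, a, b, dtheta_dzeta, mean=0.0, sd=1.0, n=20000):
    """Midpoint-rule approximation of the integral over (theta0, infinity) of
    [I_g(theta)]^(1/2) * f(theta) * d(theta)/d(zeta).

    dtheta_dzeta is a function of theta supplied by the user; it carries the
    dependence on the performance function.
    """
    upper = max(theta0, mean) + 10.0 * sd   # density is negligible beyond this
    h = (upper - theta0) / n
    total = 0.0
    for i in range(n):
        t = theta0 + (i + 0.5) * h
        total += sqrt_info_2pl(t, a, b) * normal_density(t, mean, sd) * dtheta_dzeta(t)
    return h * total

# Two hypothetical items for a N(0, 1) population, theta0 = 0, d(theta)/d(zeta) = 1.
moderate = tailored_item_validity(0.0, 1.0, 0.5, lambda t: 1.0)
too_hard = tailored_item_validity(0.0, 1.0, 5.0, lambda t: 1.0)
print(moderate, too_hard)
```

In this sketch the item that is far too difficult for the assumed population receives the smaller index, which is the behavior the weighting is meant to produce.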

If we choose to use the area under the curve of the information function instead of that of its square 
root, we obtain from (3.4) and (3.5) 



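(3.14)   $\int_{\Omega_\zeta} I^*_g(\zeta) \, d\zeta = \int_{\Omega_\theta} I_g(\theta) \frac{\partial\theta}{\partial\zeta} \, d\theta$

and

(3.15)   $\int_{\Omega_\zeta} I^*(\zeta) \, d\zeta = \int_{\Omega_\theta} I(\theta) \frac{\partial\theta}{\partial\zeta} \, d\theta$ ,

respectively. We notice that in this case, unlike those of (3.8) and (3.9), the integrands of the right
hand sides of (3.14) and (3.15) are no longer independent of the functional formula of $\zeta(\theta)$. Also when
information about the ability distribution of a target population of examinees is more or less available,
the "tailored" item and test validity indices become

(3.16)   $\int_{\Omega_\zeta} I^*_g(\zeta) f^*(\zeta) \, d\zeta = \int_{\Omega_\theta} I_g(\theta) f(\theta) \left(\frac{\partial\theta}{\partial\zeta}\right)^2 d\theta$

and

(3.17)   $\int_{\Omega_\zeta} I^*(\zeta) f^*(\zeta) \, d\zeta = \int_{\Omega_\theta} I(\theta) f(\theta) \left(\frac{\partial\theta}{\partial\zeta}\right)^2 d\theta$ ,

respectively, if we choose to use the information functions instead of their square roots.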

Note that, unlike the validity measures for "selection" purposes, in the present situation the rank
orders of validity across different items, or different tests, depend upon the choice of the validity index.
Thus a question is: which of the formulae, (3.8) or (3.14), and (3.9) or (3.15), are better as the item and
the test validity indices for "selection plus classification" purposes? A similar question is also addressed
with respect to (3.12) and (3.16), and to (3.13) and (3.17). These are tough questions to answer. While
the choice of the square root of the item information function has an advantage of a certain constancy
which has been observed earlier in this subsection, the use of the item information has a benefit of
additivity, i.e., by virtue of (2.8) the sum total of (3.14) over all the items g equals (3.15), and
the same relationship holds between (3.16) and (3.17). The answers to these questions are yet to be
found.

[III.4] Validity in Classification

When our purpose of testing is strictly the classification of individuals, as in assigning those people
to different training programs, in guidance, etc., (3.8) and (3.9), or (3.14) and (3.15), also serve as the
validity measures of an item and of a test, respectively. In this case, we must set $\gamma_0 = \underline{\gamma}$ in defining
the domains, $\Omega_\zeta$ and $\Omega_\theta$, where $\underline{\gamma}$ is the greatest lower bound of $\gamma$. Thus the two domains, $\Omega_\zeta$
and $\Omega_\theta$, in these formulae become those of $\zeta$ and $\theta$ for which $\underline{\gamma} < \zeta(\theta) < \bar{\gamma}$. It is obvious that
these formulae provide us with the item and the test validity measures, respectively, for the same reason
explained in [III.3].

The same logic applies for the "tailored" validity measures provided by (3.12) and (3.13), and by
(3.16) and (3.17), when the information concerning the ability distribution of a target population is
more or less available.

[III.5] Computerized Adaptive Testing

The item information function, $I_g(\theta)$, has been used in computerized adaptive testing in
selecting an optimal item to tailor a sequential subtest of items for an individual examinee out of the
prearranged itempool. A procedure may be to let the computer choose an item having the highest value
of $I_g(\theta)$ at the current estimated value of $\theta$ for the individual examinee, which is based upon his
responses to the items that have already been presented to him in sequence, out of the set of remaining
items in the itempool.

We notice from (3.4) or (3.6) that this procedure is justified from the standpoint of criterion-oriented
validity, for the item which provides us with the greatest item information $I_g(\theta)$ among all the available
items in the itempool also gives the greatest values of $I^*_g(\zeta)$ and its square root, at any fixed value of
$\theta$.
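The selection step described above can be sketched in a few lines. The 2PL logistic item form, the representation of the itempool as a list of (a, b) pairs, and the function names are assumptions for illustration; an operational adaptive test would also need an ability estimator, which is outside this sketch.

```python
import math

def info_2pl(theta, a, b):
    """Item information of an assumed 2PL logistic item: a^2 * P * (1 - P)."""
    p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return a * a * p * (1.0 - p)

def select_next_item(theta_hat, pool, administered):
    """Return the index of the remaining item with the largest I_g(theta_hat)."""
    remaining = [i for i in range(len(pool)) if i not in administered]
    if not remaining:
        raise ValueError("item pool is exhausted")
    return max(remaining, key=lambda i: info_2pl(theta_hat, *pool[i]))

# Hypothetical pool of three equally discriminating items, as (a, b) pairs.
pool = [(1.0, -2.0), (1.0, 0.0), (1.0, 2.0)]
first = select_next_item(0.1, pool, administered=set())
second = select_next_item(0.1, pool, administered={first})
print(first, second)
```

Because every item's information is multiplied by the same factor when it is re-expressed with respect to zeta at a fixed theta, ranking by $I_g(\theta)$ as done here gives the same choice as ranking by the information defined for zeta.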

The amount of test information can be used effectively in the stopping rule of computerized adaptive
testing. A procedure may be to terminate the presentation of new items out of the itempool to the
individual examinee when $I(\theta)$ has reached an a priori set amount at the current value of his estimated
$\theta$.

When we have a specific criterion variable $\gamma$ in mind, it is justified to use an a priori set value of
$I^*(\zeta)$ instead of $I(\theta)$. In so doing we can obtain the value of $I(\theta)$ corresponding to the a priori set
value of $I^*(\zeta)$ for each $\theta$, through the formula

(3.18)   $I(\theta) = I^*(\zeta) \left[\frac{\partial\zeta}{\partial\theta}\right]^2$ ,

which is obtained from (3.7). Thus it is easy to have the computer handle this situation, provided
that we know the functional formula for $\zeta(\theta)$.
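A stopping rule based on (3.18) might look like the following sketch. It assumes 2PL logistic items and a user-supplied function returning the derivative of zeta with respect to theta; the function names and the numerical targets are hypothetical.

```python
import math

def info_2pl(theta, a, b):
    """Item information of an assumed 2PL logistic item: a^2 * P * (1 - P)."""
    p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return a * a * p * (1.0 - p)

def required_test_information(target_info_zeta, dzeta_dtheta):
    """I(theta) needed to reach the a priori value of I*(zeta), per (3.18)."""
    return target_info_zeta * dzeta_dtheta ** 2

def should_stop(theta_hat, administered, target_info_zeta, dzeta_dtheta_fn):
    """True when the information accumulated at theta_hat meets the target.

    administered is a list of (a, b) pairs for the items presented so far;
    dzeta_dtheta_fn gives the slope of the performance function at theta.
    """
    current = sum(info_2pl(theta_hat, a, b) for a, b in administered)
    needed = required_test_information(target_info_zeta, dzeta_dtheta_fn(theta_hat))
    return current >= needed

# Hypothetical target I*(zeta) = 4 with a performance function of slope 0.5.
print(should_stop(0.0, [(1.5, 0.0)], 4.0, lambda t: 0.5))
print(should_stop(0.0, [(1.5, 0.0), (1.5, 0.0)], 4.0, lambda t: 0.5))
```

The target on the ability scale varies with theta whenever the slope of the performance function does, so the same a priori value for zeta can translate into different test lengths for different examinees.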

IV Test Validity Measures Obtained from More Accurate 
Minimum Variance Bounds 

When $\partial\zeta/\partial\theta = 0$ at some value of $\theta$, as is illustrated by a dashed line in Figure 2-2, $\partial\theta/\partial\zeta$
becomes positive infinity, and so does the item validity measure given by (3.6). This fact leaves us
with some doubt, for, while we can see that at such a point of $\theta$ item validity is high, we must wonder
if positive infinity is an adequate measure. It is also obvious from (2.8) that the same will happen to the
total test if it includes at least one such item. Our question is: should we search for more meaningful
functions than the item and test information functions? This topic will be discussed in this section.

The necessity of the search for a more accurate measure than the test information function becomes
more urgent when the performance function, $\zeta(\theta)$, is not strictly increasing in $\theta$, but is, say, only
piecewise monotone in $\theta$ with finite $\partial\theta/\partial\zeta$ and differentiable with respect to $\theta$, as is illustrated
in Figure 4-1. The illustrated performance function is still simple enough, but indicates the trend that
after a certain point of ability the performance level in a specified job decreases. This can happen when
the job does not provide enough challenge for persons of very high ability levels.

Since $I^*(\zeta)$ serves as the reciprocal of the conditional variance of the maximum likelihood estimate of
$\zeta$ only asymptotically and there exist more accurate minimum variance bounds for any (asymptotically)
unbiased estimator (cf. Kendall and Stuart, 1961), we can search for more accurate test validity
measures than the one given by (3.7) by using the reciprocal of the square roots of such minimum
variance bounds. Details of this topic will be discussed in a separate paper. Here its brief summary
related to validity measures will be given.

Let $J_{rs}(\theta)$ be defined as

where

(4.2)   $L^{(s)} = L_V^{(s)}(\theta) = \frac{\partial^s}{\partial\theta^s} L_V(\theta)$ .

Let $J(\theta)$ denote the (k x k) matrix of the element $J_{rs}(\theta)$, and $J^{-1}_{rs}(\theta)$ be the corresponding element
of its inverse matrix, $J^{-1}(\theta)$. Note that when k = 1 we can rewrite (4.1) into the form

(4.3)   $J_{kk}(\theta) = J_{11}(\theta) = E\left[\left\{\frac{\partial}{\partial\theta} \log L_V(\theta)\right\}^2 \mid \theta\right]$

                $= -E\left[\frac{\partial^2}{\partial\theta^2} \log P_V(\theta) \mid \theta\right]$ ,

and from this, (2.7) and (2.8) we can see that $J(\theta)$ is a (1 x 1) matrix whose element is the test
information function, $I(\theta)$, itself. A set of improved minimum variance bounds is given by

(4.4)   $\sum_{r=1}^{k} \sum_{s=1}^{k} \zeta^{(r)}(\theta) \, J^{-1}_{rs}(\theta) \, \zeta^{(s)}(\theta)$

(cf. Kendall and Stuart, 1961), where $\zeta^{(s)}(\theta)$ denotes the s-th partial derivative of $\zeta(\theta)$ with respect
to $\theta$. We obtain, therefore, for a set of new test validity measures

(4.5)   $\left[\sum_{r=1}^{k} \sum_{s=1}^{k} \gamma_0^{(r)} \, J^{-1}_{rs}(\theta(\gamma_0)) \, \gamma_0^{(s)}\right]^{-1/2}$ ,

where $\gamma_0^{(s)}$ indicates the s-th partial derivative of $\zeta$ with respect to $\theta$ at $\zeta = \gamma_0$.

The use of this new test validity measure will ameliorate the problems caused by $\partial\zeta/\partial\theta = 0$, if
we choose an appropriate k. The resulting algorithm will become much more complicated, however,
and we must expect a substantially larger amount of CPU time for computing these measures when k
is greater than unity. Note that (4.5) equals (3.7) when k = 1.
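The reduction to (3.7) for k = 1 can be written out as a brief worked step. It is a sketch that relies on the statement in the text that $J(\theta)$ is then the (1 x 1) matrix whose element is $I(\theta)$; the absolute value is added here only to cover either sign of the derivative.

```latex
% k = 1: J(\theta) = [\, I(\theta) \,], hence J^{-1}_{11}(\theta) = 1 / I(\theta).
\left[ \gamma_0^{(1)} \, J^{-1}_{11}(\theta(\gamma_0)) \, \gamma_0^{(1)} \right]^{-1/2}
  = \left[ \frac{(\partial\zeta/\partial\theta)^{2}}{I(\theta)} \right]^{-1/2}
  = [I(\theta)]^{1/2} \left| \frac{\partial\theta}{\partial\zeta} \right| ,
  \qquad \theta = \theta(\gamma_0) .
```

The right hand side is the square root of the test information multiplied by the derivative of ability with respect to the performance function, evaluated at $\zeta = \gamma_0$, which is the form described for (3.7).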

V Multidimensional Latent Space

When our latent space is multidimensional, a generalization of the idea given in Section 4 for the
unidimensional latent space can be made straightforwardly. We can write

(5.1)   $\theta = \{\theta_u\}'$ ,   $u = 1, 2, \ldots, \eta$ ,

and the performance function $\zeta(\theta)$ becomes a function of $\eta$ independent variables. A minimum
variance bound is given by

(5.2)   $\sum_{u=1}^{\eta} \sum_{v=1}^{\eta} \frac{\partial\zeta(\theta)}{\partial\theta_u} \frac{\partial\zeta(\theta)}{\partial\theta_v} J^{-1}_{uv}(\theta)$ ,

where $J^{-1}_{uv}(\theta)$ is the (u, v)-element of the inverse matrix of the ($\eta$ x $\eta$) symmetric matrix, whose
element is given by
(5.3)

with L abbreviating $L_V(\theta)$, or $P_V(\theta)$. The reciprocal of the square root of (5.3) will provide us
with the counterpart of (3.7) for the multidimensional latent space. For $\eta = 2$, the area $\Omega_\theta$ may look
like one of the contours illustrated in Figure 5-1, depending upon our choice of $\gamma_0$, taking the axis for
$\gamma$ vertical to the plane defined by $\theta_1$ and $\theta_2$.
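For two ability dimensions, a bound of the quadratic form "gradient of zeta, inverse information matrix, gradient of zeta" can be evaluated with a few lines of arithmetic. The sketch below assumes that the symmetric matrix is a 2 x 2 information matrix supplied by the user and that the validity measure is the reciprocal square root of the resulting bound; the matrices, gradients, and function names are hypothetical.

```python
def inverse_2x2(m):
    """Inverse of a 2 x 2 matrix given as nested lists or tuples."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0.0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]

def variance_bound(grad_zeta, info_matrix):
    """Double sum over u, v of (d zeta/d theta_u)(d zeta/d theta_v) J^{-1}_{uv}."""
    inv = inverse_2x2(info_matrix)
    return sum(grad_zeta[u] * grad_zeta[v] * inv[u][v]
               for u in range(2) for v in range(2))

def validity_measure(grad_zeta, info_matrix):
    """Reciprocal of the square root of the minimum variance bound."""
    return variance_bound(grad_zeta, info_matrix) ** -0.5

# Hypothetical information matrix and gradient of the performance function.
J = [[4.0, 0.0], [0.0, 1.0]]
grad = (1.0, 1.0)
print(variance_bound(grad, J), validity_measure(grad, J))
```

When the information matrix is diagonal, the bound is just the sum of the squared partial derivatives divided by the corresponding information values, which makes the one-dimensional case easy to recognize as a special instance.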

In a more complex situation where both ability and the criterion variables are multidimensional, we
must consider the projection of the item information function on the criterion subspace from the ability 
subspace, in order to have the item validity function for each item, and then the test validity function. 
It is anticipated that we must deal with a higher mathematical complexity in such a case. The situation 
will substantially be simplified, however, if the total set of items consists of several subsets of items, 
each of which measures, exclusively, a single ability dimension and a single criterion dimension. 

VI Discussion and Conclusions

In contrast to the progressive dissolution of the reliability coefficient in classical mental test theory
and the replacement by the test information function in latent trait models, the issue of test validity
has been more or less neglected in modern mental test theory. The present paper proposes some
considerations about the validity of a test and of a single item. Effort has been focused upon searching
for measures which are population-free, and which will provide us with local and abundant information
just as the information functions do in comparison with the test reliability coefficient in classical mental
test theory. In so doing, validity indices for different purposes of testing and also those which are
tailored for a specific population of examinees are considered.

The above considerations for the item and test validities may be just part of many possible
approaches. We may still have a long way to go before we discover the most useful measures of the item
and test validities. The aim of the present paper is rather to provide stimulation so that researchers
will pursue this topic further, taking different approaches.